# A Study of Necessity & Sufficiency of Linear Transformations in the Attention Mechanism
The steps below describe how to set up the environment and run the
experiments in the paper. The experiments were run on a machine
with Ubuntu 22.04, Linux kernel 6.2.0, 128 GB of RAM, and an
Nvidia RTX 4090 GPU with 24 GB of memory. If you are using a GPU
with less memory, you may need to reduce the batch size in some
of the experiments.

## Setting up the Environment
First, create a Conda environment.

    conda create -n betterattention python==3.11.5

Then, activate the Conda environment using the following
command.

    conda activate betterattention

Finally, move to the directory containing this **README** file
and install the required libraries using the following command.

    pip install -r requirements.txt
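
Before running the experiments, it can help to confirm that the right environment is active. The snippet below is a minimal sketch, not part of the repository: the helper name `environment_problems` is hypothetical, and it assumes the environment was activated with `conda activate`, which sets the `CONDA_DEFAULT_ENV` variable.

```python
import os
import sys

# Values taken from the setup commands above.
EXPECTED_ENV = "betterattention"
EXPECTED_PYTHON = (3, 11)


def environment_problems(version_info=sys.version_info, env_name=None):
    """Return a list of human-readable problems; an empty list means all good.

    Hypothetical helper, not part of the repository.
    """
    if env_name is None:
        # `conda activate` exports the active environment name here.
        env_name = os.environ.get("CONDA_DEFAULT_ENV", "")
    problems = []
    if tuple(version_info[:2]) != EXPECTED_PYTHON:
        problems.append(
            f"Python {version_info[0]}.{version_info[1]} is active, expected 3.11"
        )
    if env_name != EXPECTED_ENV:
        problems.append(
            f"Conda environment is '{env_name}', expected '{EXPECTED_ENV}'"
        )
    return problems


if __name__ == "__main__":
    for problem in environment_problems():
        print(problem)
```

An empty output means the interpreter version and the environment name both match the setup described above.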

## Experiments
To run each of the experiments in the paper, use the commands
provided in the corresponding section below. The experiment on the
Amazon Reviews dataset requires downloading the dataset manually.
The dataset and its download link are listed in the table below.

| Dataset        | Download Page                                                        | Path                     |
|----------------|----------------------------------------------------------------------|--------------------------|
| Amazon Reviews | https://figshare.com/articles/dataset/Amazon_Reviews_Full/13232537   | `amazon_review_full_csv` |

The code expects the dataset in the `resources` directory, in a
folder named according to the **Path** column of the table
above. The expected directory for the Amazon Reviews dataset is
therefore `resources/amazon_review_full_csv`. Extracting the
downloaded zip file in the `resources` directory should create a
folder named `amazon_review_full_csv`.
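
A quick way to confirm that the extraction produced the expected layout is sketched below. The helper name `amazon_dataset_ready` is hypothetical and not part of the repository. It only checks that the folder exists and is non-empty, and it assumes the script is run from the directory containing this README.

```python
from pathlib import Path

# Paths taken from the table above.
RESOURCES_DIR = Path("resources")
AMAZON_DIR_NAME = "amazon_review_full_csv"


def amazon_dataset_ready(resources_dir=RESOURCES_DIR):
    """Return True if the extracted dataset folder exists and is non-empty.

    Hypothetical helper; it does not validate the files inside the folder.
    """
    dataset_dir = Path(resources_dir) / AMAZON_DIR_NAME
    return dataset_dir.is_dir() and any(dataset_dir.iterdir())


if __name__ == "__main__":
    target = RESOURCES_DIR / AMAZON_DIR_NAME
    if amazon_dataset_ready():
        print(f"Found dataset in {target}")
    else:
        print(f"Missing or empty: {target}")
```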

Additionally, for the ImageNet dataset, we use
`tensorflow-datasets` to load the data. However, you need to
download the dataset manually first. Visit the original website at
https://www.image-net.org/challenges/LSVRC/2012/2012-downloads.php#images and download
the train and validation splits of the ImageNet1K dataset. You need to log in (or
sign up first if you do not have an account) to see the download links on the website.
Once you agree to the terms, the download links become available immediately.
Then, in our `vit_imagenet.py` script,
change the `cache_dir` variable to the directory
where you have stored the dataset.
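
The check below is a minimal sketch for verifying the download before starting the ImageNet experiment. The archive names are the original file names of the ILSVRC2012 train and validation image archives. The helper `missing_imagenet_archives` is hypothetical, and whether `vit_imagenet.py` expects the archives directly inside `cache_dir` is an assumption.

```python
from pathlib import Path

# Original file names of the ILSVRC2012 image archives.
# Assumption: the script looks for them directly inside cache_dir.
EXPECTED_ARCHIVES = ("ILSVRC2012_img_train.tar", "ILSVRC2012_img_val.tar")


def missing_imagenet_archives(cache_dir):
    """Return the expected archive names that are not present in cache_dir.

    Hypothetical helper, not part of the repository.
    """
    cache_dir = Path(cache_dir)
    return [name for name in EXPECTED_ARCHIVES if not (cache_dir / name).is_file()]
```

For example, `missing_imagenet_archives("/path/to/imagenet")` returns an empty list when both archives are in place, where `/path/to/imagenet` stands for the directory you assigned to `cache_dir`.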

After running each experiment, the results are stored in a folder
named after that experiment in the `results` directory.

### 1. Transformer for Text Classification

#### 1.1. Transformer on IMDB Reviews Dataset
To run the experiment, use the following command.

    python evaluation/text_classification/transformer_imdb.py

#### 1.2. Transformer on Amazon Reviews Dataset
To run the experiment, use the following command.

    python evaluation/text_classification/transformer_amazon.py

### 2. Vision Transformer for Image Classification

#### 2.1. Vision Transformer on MNIST Dataset
To run the experiment, use the following command.

    python evaluation/image_classification/vit_mnist.py

#### 2.2. Vision Transformer on CIFAR-100 Dataset
To run the experiment, use the following command.

    python evaluation/image_classification/vit_cifar100.py

#### 2.3. Vision Transformer on ImageNet1k Dataset
To run the experiment, use the following command.

    python evaluation/image_classification/vit_imagenet.py

### 3. Transformer for Neural Machine Translation (NMT)

#### 3.1. Transformer on Europarl and Anki Datasets
To run the experiment, use the following command.

    python evaluation/translation/translate_en-es.py
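
To run all of the experiments above back to back, a small driver along the following lines may be convenient. This is a sketch and not part of the repository: the names `build_commands` and `run_all` are hypothetical, and it assumes it is run from the directory containing this README with the Conda environment active. It defaults to a dry run that only prints the commands.

```python
import subprocess
import sys

# Script paths copied from the sections above, in the same order.
EXPERIMENT_SCRIPTS = [
    "evaluation/text_classification/transformer_imdb.py",
    "evaluation/text_classification/transformer_amazon.py",
    "evaluation/image_classification/vit_mnist.py",
    "evaluation/image_classification/vit_cifar100.py",
    "evaluation/image_classification/vit_imagenet.py",
    "evaluation/translation/translate_en-es.py",
]


def build_commands(python=sys.executable):
    """Return one argument list per experiment, using the given interpreter."""
    return [[python, script] for script in EXPERIMENT_SCRIPTS]


def run_all(dry_run=True):
    """Print each command and, unless dry_run is set, execute it.

    check=True stops at the first experiment that exits with an error.
    """
    for command in build_commands():
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    run_all(dry_run=True)
```

Calling `run_all(dry_run=False)` executes the scripts sequentially. The Amazon Reviews and ImageNet experiments still require the manual downloads described earlier.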
